ENH/API: offsets funcs now accepts datetime64 #7452

Merged
merged 1 commit into from
Jun 14, 2014

sinhrks
Member

@sinhrks sinhrks commented Jun 13, 2014

Even though CustomBusinessDay.apply can handle np.datetime64, most other offsets cannot accept datetime64 and raise ApplyTypeError. This fix allows the apply, rollforward and rollback methods of all offsets to handle np.datetime64 properly.

import pandas as pd
import numpy as np
t = np.datetime64('2011-01-01 09:00Z')

cday = pd.offsets.CustomBusinessDay()
cday.apply(t)
#2011-01-02 09:00:00

day = pd.offsets.Day()
day.apply(t)
# pandas.tseries.offsets.ApplyTypeError: Unhandled type: datetime64
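
The intended behavior can be sketched with calls that exist in current pandas releases. This is only an illustration, not the PR's test code. It assumes a pandas version in which rollforward and rollback accept np.datetime64, and it uses an explicit pd.Timestamp conversion plus the + operator as a stand-in for apply, because newer pandas releases no longer expose apply. The timezone-naive sample date (a Saturday) and the choice of BusinessDay are assumptions made for the example.

```python
import numpy as np
import pandas as pd

# Timezone-naive datetime64 value; 2011-01-01 is a Saturday.
t = np.datetime64('2011-01-01T09:00')

bday = pd.offsets.BusinessDay()

# rollforward/rollback move a date that is not on the offset to the
# next/previous date that is, and leave on-offset dates unchanged.
forward = bday.rollforward(t)
back = bday.rollback(t)
print(forward)  # 2011-01-03 09:00:00
print(back)     # 2010-12-31 09:00:00

# Stand-in for day.apply(t): convert explicitly, then add the offset.
day = pd.offsets.Day()
shifted = pd.Timestamp(t) + day
print(shifted)  # 2011-01-02 09:00:00
```

In each case the result is a pd.Timestamp, regardless of whether the input was a datetime, a Timestamp or an np.datetime64.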

NOTE: CustomBusinessDay had separate logic for datetime and np.datetime64. In a comparison on current master, the np.datetime64 logic is slower, so I removed it.

import timeit
setup = """
import pandas as pd
import numpy as np

cday = pd.offsets.CustomBusinessDay()

np_dt64 = [np.datetime64('2014-05-{0:02} 09:00Z'.format(i)) for i in range(1, 31)]
timestamps = [pd.Timestamp('2014-05-{0:02} 09:00Z'.format(i)) for i in range(1, 31)]
"""

t = timeit.Timer('[cday.apply(d) for d in np_dt64]', setup)
print(t.timeit(1000))
#1.6253619194
t = timeit.Timer('[cday.apply(d) for d in timestamps]', setup)
print(t.timeit(1000))
#0.959406137466
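
The idea in the NOTE above (normalize the input once, then keep a single code path) can be sketched as follows. This is a hypothetical illustration and not the code in this PR: the decorator name coerce_to_timestamp and the toy function next_weekday are invented for the example, and only pd.Timestamp and np.datetime64 are real library names.

```python
import functools
from datetime import datetime, timedelta

import numpy as np
import pandas as pd


def coerce_to_timestamp(func):
    """Hypothetical decorator: convert np.datetime64 input to pd.Timestamp
    so the wrapped function only needs datetime-based logic."""
    @functools.wraps(func)
    def wrapper(other):
        if isinstance(other, np.datetime64):
            other = pd.Timestamp(other)
        if not isinstance(other, datetime):
            raise TypeError("Unhandled type: {0}".format(type(other).__name__))
        return func(other)
    return wrapper


@coerce_to_timestamp
def next_weekday(ts):
    """Toy stand-in for an offset's apply logic: next Monday-to-Friday day."""
    ts = ts + timedelta(days=1)
    while ts.weekday() > 4:  # 5 and 6 are Saturday and Sunday
        ts = ts + timedelta(days=1)
    return ts


print(next_weekday(np.datetime64('2011-01-01T09:00')))  # 2011-01-03 09:00:00
print(next_weekday(datetime(2011, 1, 1, 9, 0)))         # 2011-01-03 09:00:00
```

With this shape there is no separate np.datetime64 branch to maintain or benchmark; the cost is one conversion at the boundary.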

@jreback
Contributor

jreback commented Jun 13, 2014

hmm, there is a vbench for this
can you add some of the other types to it as well (the other offsets)?

I don't recall exactly the speed issue wrt datetime/datetime64, but there is a closed issue about it - can you xref the issue and make sure that nothing has changed?

thanks

@jreback
Contributor

jreback commented Jun 13, 2014

xref. #6592

@sinhrks
Member Author

sinhrks commented Jun 14, 2014

The vbench result follows. As I understand it, #6592 added two separate code paths, one for datetime and one for np.datetime64, and both are faster than the previous implementation.

Comparing the two, the datetime logic is faster (thus I removed the np.datetime64 logic in this PR).

CC @bjonen

-------------------------------------------------------------------------------
Test name                                    | head[ms] | base[ms] |  ratio   |
-------------------------------------------------------------------------------
read_store_table                             |   5.0253 |   4.8374 |   1.0389 |
frame_to_csv_date_formatting                 |  36.4916 |  34.9390 |   1.0444 |
read_store_table_mixed                       |  18.4183 |  17.6213 |   1.0452 |
index_float64_boolean_indexer                |   7.6843 |   7.3157 |   1.0504 |
frame_loc_dups                               |   1.3700 |   1.2767 |   1.0731 |
dtype_infer_uint32                           |   2.6900 |   2.3301 |   1.1545 |
index_str_boolean_series_indexer             |  41.9977 |  35.5417 |   1.1816 |

@hayd
Contributor

hayd commented Jun 14, 2014

Related to #6318?

@jreback
Contributor

jreback commented Jun 14, 2014

@sinhrks can you post the benches that relate to this?

@sinhrks
Member Author

sinhrks commented Jun 14, 2014

@hayd It looks different: this PR is for datetime64, not timedelta.
@jreback Here it is.

-------------------------------------------------------------------------------
Test name                                    | head[ms] | base[ms] |  ratio   |
-------------------------------------------------------------------------------
timeseries_custom_bday_apply_dt64            |   0.0366 |   0.0540 |   0.6779 |
timeseries_custom_bmonthend_incr_n           |   0.2480 |   0.2427 |   1.0219 |
timeseries_custom_bday_incr_n                |   0.0547 |   0.0533 |   1.0253 |
timeseries_custom_bmonthend_incr             |   0.2210 |   0.2154 |   1.0262 |
timeseries_custom_bday_apply                 |   0.0337 |   0.0327 |   1.0316 |
timeseries_custom_bday_incr                  |   0.0363 |   0.0350 |   1.0386 |
-------------------------------------------------------------------------------
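
For reading the table above, the ratio column appears to be head divided by base, so a value below 1 means this branch is faster. The following is a small arithmetic check under that assumption, using two rows copied from the table; the helper name ratio is made up for the example.

```python
# (head_ms, base_ms) pairs copied from the table above.
results = {
    "timeseries_custom_bday_apply_dt64": (0.0366, 0.0540),
    "timeseries_custom_bday_apply": (0.0337, 0.0327),
}


def ratio(head_ms, base_ms):
    """Assumed definition: head timing divided by base timing."""
    return head_ms / base_ms


for name, (head_ms, base_ms) in results.items():
    r = ratio(head_ms, base_ms)
    change = (r - 1.0) * 100.0  # negative means head is faster
    print("{0}: ratio={1:.2f}, change={2:+.0f}%".format(name, r, change))
# timeseries_custom_bday_apply_dt64: ratio=0.68, change=-32%
# timeseries_custom_bday_apply: ratio=1.03, change=+3%
```

The recomputed values differ slightly in the fourth decimal place from the table (0.6778 versus 0.6779), presumably because the table shows rounded timings.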

@jreback jreback added this to the 0.14.1 milestone Jun 14, 2014
jreback added a commit that referenced this pull request Jun 14, 2014
ENH/API: offsets funcs now accepts datetime64
@jreback jreback merged commit 8d209de into pandas-dev:master Jun 14, 2014
@sinhrks sinhrks deleted the offsets_dt64 branch June 14, 2014 21:52
Labels
Frequency (DateOffsets), Performance (Memory or execution speed performance)
3 participants